Working with data from an API

Web APIs are a common way to access data, and many governments run open data portals that offer access via API. Socrata is a common vendor for these portals.

Here, we're going to tap into the API feed of a dataset of vacant buildings in St. Paul.

Import the modules we need


In [ ]:
import json
import requests

Fetch the page and get the JSON


In [ ]:
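# URL of the dataset's JSON endpoint
URL = 'https://information.stpaul.gov/resource/rfbb-x7za.json'

# fetch the data; the timeout keeps the request from hanging indefinitely
response = requests.get(URL, timeout=30)

# raise an exception if the server returned an error status
response.raise_for_status()

# use the json() method, which converts the JSON into Python objects
vb_data = response.json()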

# print to see what we're working with
print(vb_data)
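
Printing the whole response can be a lot to scroll through. As a rough sketch, here are a few ways to peek at the structure instead. The `sample` list below is made-up data standing in for `vb_data`, and its field names are assumptions rather than a confirmed description of the live dataset.

```python
import json

# Hypothetical sample standing in for the API response; the field names
# are assumed, not confirmed against the live dataset.
sample = [
    {"address": "123 EXAMPLE ST",
     "dwelling_type": "Single Family Residential",
     "vacant_as_of": "2014-03-01T00:00:00"},
    {"address": "456 SAMPLE AVE",
     "dwelling_type": "Duplex",
     "vacant_as_of": "2011-07-15T00:00:00"},
]

# what kind of container is this, and how many records does it hold?
print(type(sample))
print(len(sample))

# what does a single record look like?
print(type(sample[0]))
print(sorted(sample[0].keys()))

# pretty-print one record instead of the whole list
print(json.dumps(sample[0], indent=2))
```

Swapping `vb_data` in for `sample` should give the same kind of overview for the real response.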

Filter the data

Looks like we're dealing with a list of dictionaries. Let's say our goal is to filter out everything except the single-family residences.

Let's use a new thing called a list comprehension -- really handy when you want to filter a group of things and store the result in a variable.
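
If list comprehensions are new to you, it may help to see one next to the equivalent for loop. This is a small sketch on toy numbers that have nothing to do with the API data.

```python
# Toy data, unrelated to the API response
numbers = [1, 2, 3, 4, 5, 6]

# The long way: build up a new list with a for loop
evens_loop = []
for n in numbers:
    if n % 2 == 0:
        evens_loop.append(n)

# The same filter written as a list comprehension
evens = [n for n in numbers if n % 2 == 0]

print(evens)                # [2, 4, 6]
print(evens == evens_loop)  # True
```

The pattern is `[thing for thing in group if condition]`: the part before `for` is what gets stored, and the `if` decides which items make the cut.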


In [ ]:
# .get() returns None instead of raising a KeyError if a record lacks the field
sfr_vb = [x for x in vb_data if x.get('dwelling_type') == 'Single Family Residential']
print(len(sfr_vb), 'SFR of', len(vb_data), 'total')

Exercise

From the original data set, keep only the buildings whose vacancy date (vacant_as_of) falls in 2013 or later. Select whatever elements of the data are interesting to you and write them out to a CSV file.

Breaking down the problem:

  • Filter the data to include only buildings vacant as of 2013 or later
    • Use slicing to isolate the year from the vacancy date
    • Coerce that year string to an integer
    • In a list comprehension, use an if clause to check whether the year is greater than or equal to 2013
  • Open a file to write to
  • Loop over that filtered list of data
  • Select the elements of the data that you think belong in your CSV and write them out
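
The steps above could be sketched roughly like this. It is one possible approach, not the only one, and it runs on made-up records rather than the live feed. The field names, the date format (year in the first four characters) and the output filename are all assumptions, so check them against the real response before reusing any of it.

```python
import csv

# Hypothetical records standing in for vb_data. The field names and the
# 'YYYY-MM-DD...' date format are assumptions; check the real response.
sample_data = [
    {"address": "123 EXAMPLE ST", "dwelling_type": "Duplex",
     "vacant_as_of": "2014-03-01T00:00:00"},
    {"address": "456 SAMPLE AVE", "dwelling_type": "Single Family Residential",
     "vacant_as_of": "2011-07-15T00:00:00"},
    {"address": "789 DEMO BLVD", "dwelling_type": "Commercial",
     "vacant_as_of": "2013-01-20T00:00:00"},
    {"address": "1 NO DATE LN", "dwelling_type": "Duplex"},
]

# Slice off the first four characters to get the year, coerce it to an
# integer and compare. Records with no vacant_as_of value are skipped.
recent = [
    x for x in sample_data
    if x.get("vacant_as_of") and int(x["vacant_as_of"][:4]) >= 2013
]

# Columns chosen for illustration; pick whichever interest you
headers = ["address", "dwelling_type", "vacant_as_of"]

# Open a file to write to (the filename is made up for this sketch)
with open("vacant_since_2013.csv", "w", newline="") as outfile:
    writer = csv.DictWriter(outfile, fieldnames=headers, extrasaction="ignore")
    writer.writeheader()

    # Loop over the filtered list and write out each row
    for row in recent:
        writer.writerow(row)

print(len(recent), "of", len(sample_data), "records written")
```

`extrasaction="ignore"` tells `csv.DictWriter` to drop any keys that aren't in `headers`, which is handy when each record carries more fields than you want in the CSV.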

In [ ]: